Zero-Shot Scene Graph Relation Prediction Through Commonsense Knowledge Integration
Abstract
Relation prediction among entities in images is an important step in scene graph generation (SGG), which further impacts various visual understanding and reasoning tasks. Existing SGG frameworks, however, require heavy training yet are incapable of modeling unseen (i.e., zero-shot) triplets. In this work, we stress that such incapability is due to the lack of commonsense reasoning, i.e., the ability to associate similar entities and infer similar relations based on a general understanding of the world. To fill this gap, we propose CommOnsense-integrAted sCene grapH rElation pRediction (COACHER), a framework to integrate commonsense knowledge for SGG, especially for zero-shot relation prediction. Specifically, we develop novel graph mining pipelines to model the neighborhoods and paths around entities in an external commonsense knowledge graph, and integrate them on top of state-of-the-art SGG frameworks. Extensive quantitative evaluations and qualitative case studies on both original and manipulated datasets from Visual Genome demonstrate the effectiveness of our proposed approach. The code is available at https://github.com/Wayfear/Coacher.
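To make the idea of modeling neighborhoods and paths around entities in a knowledge graph more concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the toy graph, its edges, and the helper names `neighborhood` and `shortest_path` are all hypothetical and chosen only for illustration. It uses plain breadth-first search to collect the nodes within a few hops of an entity and to find one shortest path between two entities.

```python
from collections import deque

# Toy undirected commonsense graph. All edges are hypothetical and are
# NOT taken from the paper or from any real knowledge base.
TOY_GRAPH = {
    "person": ["human", "rider"],
    "human": ["person", "animal"],
    "rider": ["person", "horse", "bike"],
    "horse": ["rider", "animal"],
    "bike": ["rider", "vehicle"],
    "animal": ["human", "horse"],
    "vehicle": ["bike"],
}


def neighborhood(graph, entity, hops=1):
    """Return the set of nodes within `hops` edges of `entity`, excluding it."""
    seen = {entity}
    frontier = deque([(entity, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    seen.discard(entity)
    return seen


def shortest_path(graph, source, target):
    """Breadth-first search for one shortest path; returns None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


if __name__ == "__main__":
    print(neighborhood(TOY_GRAPH, "person", hops=1))
    print(shortest_path(TOY_GRAPH, "person", "horse"))
```

In a sketch like this, the neighborhood of an entity hints at which other entities it resembles, while a path between a subject and an object hints at how they might be related. How such signals are encoded and combined with an SGG model is described in the paper and the linked repository, not here.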
Similar resources
Zero-shot Object Prediction using Semantic Scene Knowledge
This work focuses on the semantic relations between scenes and objects for visual object recognition. Semantic knowledge can be a powerful source of information especially in scenarios with few or no annotated training samples. These scenarios are referred to as zero-shot or few-shot recognition and often build on visual attributes. Here, instead of relying on various visual attributes, a more d...
Semantic Graph for Zero-Shot Learning
Zero-shot learning aims to classify visual objects without any training data via knowledge transfer between seen and unseen classes. This is typically achieved by exploring a semantic embedding space where the seen and unseen classes can be related. Previous works differ in what embedding space is used and how different classes and a test image can be related. In this paper, we utilize the anno...
Zero-Shot Relation Extraction via Reading Comprehension
We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot. This reduction has several advantages: we can (1) learn relation-extraction models by extending recent neural reading-comprehension techniques, (2) build very large training sets for those models by combining relation-...
Zero-Shot Recognition via Structured Prediction
We develop a novel method for zero-shot learning (ZSL) based on test-time adaptation of similarity functions learned using training data. Existing methods exclusively employ source-domain side information for recognizing unseen classes during test time. We show that for batch-mode applications, accuracy can be significantly improved by adapting these predictors to the observed test-time target-...
From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge
In this paper we propose the construction of linguistic descriptions of images. This is achieved through the extraction of scene description graphs (SDGs) from visual scenes using an automatically constructed knowledge base. SDGs are constructed using both vision and reasoning. Specifically, commonsense reasoning is applied on (a) detections obtained from existing perception methods on given i...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-86520-7_29